DaNetQA: A Yes/No Question Answering Dataset for the Russian Language

Abstract

DaNetQA, a new question-answering corpus, follows the BoolQ [2] design: it comprises natural yes/no questions. Each question is paired with a paragraph from Wikipedia and an answer derived from the paragraph. The task is to take both the question and the paragraph as input and come up with a yes/no answer, i.e., to produce a binary output. In this paper, we present a reproducible approach to DaNetQA creation and investigate transfer learning methods for task and language transferring. For task transferring, we leverage three similar sentence modelling tasks: 1) a corpus of paraphrases, Paraphraser; 2) an NLI task, for which we use the Russian part of XNLI; 3) another question answering task, SberQUAD. For language transferring, we use English-to-Russian translation together with multilingual language fine-tuning.
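The input/output contract described in the abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class name `YesNoExample`, its field names, the `[SEP]` separator, and the toy English examples are all hypothetical and do not come from the DaNetQA release or from the paper. The majority-class predictor is a generic sanity-check baseline, not a method the authors describe.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class YesNoExample:
    """One item of a yes/no QA task (hypothetical schema, not DaNetQA's)."""
    question: str  # natural yes/no question
    passage: str   # paragraph the answer is derived from
    answer: bool   # True = "yes", False = "no"


def to_pair_text(ex: YesNoExample, sep: str = " [SEP] ") -> str:
    """Join question and passage into one string for a pair classifier.

    The separator is an assumption; a real model would use its own tokenizer.
    """
    return ex.question + sep + ex.passage


def majority_answer(train: List[YesNoExample]) -> bool:
    """Return the most frequent answer in the training set (ties -> True)."""
    counts = Counter(ex.answer for ex in train)
    return counts[True] >= counts[False]


def accuracy(preds: List[bool], golds: List[bool]) -> float:
    """Fraction of predictions that match the gold binary answers."""
    if len(preds) != len(golds):
        raise ValueError("preds and golds must have the same length")
    if not golds:
        return 0.0
    return sum(p == g for p, g in zip(preds, golds)) / len(golds)


if __name__ == "__main__":
    # Invented toy data, for illustration only.
    train = [
        YesNoExample("Is Moscow the capital of Russia?",
                     "Moscow is the capital of Russia.", True),
        YesNoExample("Is the Volga in Africa?",
                     "The Volga is a river in Russia.", False),
        YesNoExample("Is Lake Baikal in Russia?",
                     "Lake Baikal is located in Siberia, Russia.", True),
    ]
    guess = majority_answer(train)
    preds = [guess for _ in train]
    print(to_pair_text(train[0]))
    print(accuracy(preds, [ex.answer for ex in train]))
```

Any transfer-learning model for the task would replace `majority_answer` with a classifier over the question-passage pair; the evaluation by accuracy against binary labels stays the same.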

Related resources

Question Answering on the SQuAD Dataset

We develop a deep learning framework for question answering on the Stanford Question Answering Dataset (SQuAD), blending ideas from existing state-of-the-art models to achieve results that surpass the original logistic regression baselines. Using a dynamic coattention encoder and an LSTM decoder, we achieved an F1 score of 55.9% on the hidden SQuAD test set. In this paper, we present the methodo...

Question Answering System for the French Language

This paper describes our first participation in the QA@CLEF monolingual and bilingual task, where our objective was to propose a question answering system designed to respond to French queries submitted to search French documents. We wanted to combine a classic information retrieval model (based on the Okapi probabilistic model) with a linguistic approach based mainly on syntactic analysis. In ...

SQuAD Question Answering Dataset: CS224N Assn 4

We solve the contextual question answering problem, which is an essential part of many automated question-answering systems. Recently the SQuAD dataset [1] was released, and several deep learning approaches have been proposed to solve it. We implement a modified version of one of them, the Dynamic Coattention model, as well as a simple baseline.

An Exploration of Approaches for the Stanford Question Answering Dataset

In this paper, we present an exploration of several approaches of varying complexity, novelty and effectiveness applied to solving the reading comprehension problem evaluated by the Stanford Question Answering Dataset (SQuAD), taken from the perspective of a novice practitioner of Deep Neural Networks for Natural Language Processing. Here, we present several models, their mathematical structure...

Language Independent Passage Retrieval for Question Answering

Passage Retrieval (PR) is typically used as the first step in current Question Answering (QA) systems. Most methods are based on the vector space model, which finds relevant passages for general user needs but fails to select pertinent passages for specific user questions. This paper describes a simple PR method specially suited for the QA task. This method considers the struct...

Journal

Journal title: Lecture Notes in Computer Science

Year: 2021

ISSN: 1611-3349, 0302-9743

DOI: https://doi.org/10.1007/978-3-030-72610-2_4